Integer programming based heterogeneous CPU–GPU cluster schedulers for SLURM resource manager
Similar resources
Integer programming based heterogeneous CPU-GPU cluster schedulers for SLURM resource manager
We present two integer programming based heterogeneous CPU-GPU cluster schedulers, called IPSCHED and AUCSCHED, for the widely used SLURM resource manager. Our scheduler algorithms take windows of jobs and solve allocation problems in which free CPU cores and GPU cards are allocated collectively to jobs so as to maximize some objective functions. Our AUCSCHED scheduler employs an auction based ...
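The windowed allocation idea in this abstract can be illustrated with a toy example. The sketch below is not the IPSCHED or AUCSCHED formulation, which the truncated abstract does not give. It assumes a hypothetical window of jobs, each with a CPU core demand, a GPU card demand, and a priority value, and it solves the resulting 0/1 integer program by plain enumeration instead of an IP solver. The names `schedule_window`, `free_cpus`, and `free_gpus` are illustrative only.

```python
from itertools import product


def schedule_window(jobs, free_cpus, free_gpus):
    """Pick the subset of jobs in a window that maximizes total priority.

    jobs: list of (name, cpus, gpus, priority) tuples (hypothetical format).
    Returns (best_value, chosen_names).
    """
    best_value, best_choice = 0, []
    # Each x is a 0/1 vector: x[i] == 1 means job i is started now.
    for x in product((0, 1), repeat=len(jobs)):
        cpus = sum(xi * job[1] for xi, job in zip(x, jobs))
        gpus = sum(xi * job[2] for xi, job in zip(x, jobs))
        # Capacity constraints: stay within the free cores and free GPU cards.
        if cpus > free_cpus or gpus > free_gpus:
            continue
        # Objective: total priority of the selected jobs.
        value = sum(xi * job[3] for xi, job in zip(x, jobs))
        if value > best_value:
            best_value = value
            best_choice = [job[0] for xi, job in zip(x, jobs) if xi]
    return best_value, best_choice


# Assumed window: (name, cpu cores, gpu cards, priority)
window = [("a", 8, 0, 5), ("b", 4, 2, 9), ("c", 6, 1, 6), ("d", 2, 2, 4)]
best_value, chosen = schedule_window(window, free_cpus=12, free_gpus=3)
print(best_value, chosen)
```

Enumeration is exponential in the window size, so it is only practical for very small windows; a real scheduler would pass the same variables and constraints to an integer programming solver.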
Speculation-aware Resource Allocation for Cluster Schedulers
Resource allocation and straggler mitigation (via “speculative” copies) are two key building blocks for analytics frameworks. Today, the two solutions are largely decoupled from each other, losing the opportunities of joint optimization. Resource allocation across jobs assumes that each job runs a fixed set of tasks, ignoring their need to dynamically run speculative copies for stragglers. Cons...
Resource Manager for Globus-Based Wide-Area Cluster Computing
In this paper, we present a new type of Globus resource allocation manager (GRAM) called RMF (Resource Manager beyond the Firewall) for wide-area cluster computing. RMF manages computing resources such as cluster systems and enables their use beyond the firewall in global computing environments. RMF consists of two basic modules, a remote job queuing system (Q system) and a resource allo...
A Mixed Integer Programming Formulation for the Heterogeneous Fixed Fleet Open Vehicle Routing Problem
The heterogeneous fixed fleet open vehicle routing problem (HFFOVRP) is one of the most significant extension problems of the open vehicle routing problem (OVRP). The HFFOVRP is the problem of designing collection routes to a number of predefined nodes by a fixed fleet number of vehicles with various capacities and related costs. In this problem, the vehicle doesn’t return to the depot after se...
SLURM: Simple Linux Utility for Resource Management
Simple Linux Utility for Resource Management (SLURM) is an open source, fault-tolerant, and highly scalable cluster management and job scheduling system for Linux clusters of thousands of nodes. Components include machine status, partition management, job management, scheduling, and stream copy modules. This paper presents an overview of the SLURM architecture and functionality. 1 Overview Simpl...
Journal
Journal title: Journal of Computer and System Sciences
Year: 2015
ISSN: 0022-0000
DOI: 10.1016/j.jcss.2014.06.011